Algorithms for CVaR Optimization in MDPs

Authors

  • Yinlam Chow
  • Mohammad Ghavamzadeh
Abstract

In many sequential decision-making problems we may want to manage risk by minimizing some measure of variability in costs in addition to minimizing a standard criterion. Conditional value-at-risk (CVaR) is a relatively new risk measure that addresses some of the shortcomings of the well-known variance-related risk measures and, because of its computational efficiency, has gained popularity in finance and operations research. In this paper, we consider the mean-CVaR optimization problem in MDPs. We first derive a formula for computing the gradient of this risk-sensitive objective function. We then devise policy gradient and actor-critic algorithms, each of which uses a specific method to estimate this gradient and updates the policy parameters in the descent direction. We establish the convergence of our algorithms to locally risk-sensitive optimal policies. Finally, we demonstrate the usefulness of our algorithms in an optimal stopping problem.
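
To make the risk measure in the abstract concrete, here is a minimal sketch of estimating CVaR from a sample of costs. It uses the standard Rockafellar-Uryasev representation, CVaR_alpha(Z) = min over nu of { nu + E[(Z - nu)^+] / (1 - alpha) }, and is not the gradient algorithm of the paper. The function names `ru_objective` and `empirical_var_cvar` are hypothetical, chosen only for this illustration.

```python
def ru_objective(costs, nu, alpha):
    """Rockafellar-Uryasev objective: nu + E[(Z - nu)^+] / (1 - alpha)."""
    excess = sum(max(z - nu, 0.0) for z in costs) / len(costs)
    return nu + excess / (1.0 - alpha)


def empirical_var_cvar(costs, alpha):
    """Return (VaR, CVaR) estimates for a cost sample at level alpha in (0, 1).

    Assumption for this sketch: the objective is piecewise linear and convex
    in nu, so on a finite sample its minimum is attained at a sample value.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Search only over the distinct sample values, in increasing order.
    candidates = sorted(set(costs))
    # min() keeps the first (smallest) minimizer, which serves as the VaR.
    var = min(candidates, key=lambda nu: ru_objective(costs, nu, alpha))
    return var, ru_objective(costs, var, alpha)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]
    var, cvar = empirical_var_cvar(sample, alpha=0.5)
    print("VaR:", var, "CVaR:", cvar)
```

The search over sample values is quadratic in the sample size, which is acceptable for illustration. A sort-based estimator that averages the worst (1 - alpha) fraction of costs would be the usual choice for larger samples.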

Similar resources

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach

In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such a problem as CVaR MDP. Our first contribution is to show that a CVaR...

Policy Gradients for CVaR-Constrained MDPs

We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR). We propose two algorithms that obtain a locally risk-optimal policy by employing four tools: stochastic approximation, mini batches, policy gradients and importance sampling. Both the algorithms incorporate a CVaR estimation procedure, along t...

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective of this paper is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented v...

A Parametric Comparison of the Efficient Frontiers of the Markowitz, Value at Risk, and Conditional Value at Risk Risk-Management Models Using the Simulated Annealing Optimization Algorithm on the Tehran Stock Exchange

Nowadays risk management is as vital as gaining the maximum return. Therefore, research on risk management and its different models is very useful for investors. Using a local optimization algorithm (the fmincon function) and a global one (simulated annealing), based on three risk management models, namely Markowitz, Value at Risk (VaR) and Conditional Value at Risk (CVaR), this research see...

Robust Portfolio Optimization with risk measure CVAR under MGH distribution in DEA models

Financial returns exhibit stylized facts such as leptokurtosis, skewness and heavy-tailedness. Given this behavior, in this paper we apply the multivariate generalized hyperbolic (mGH) distribution for portfolio modeling and performance evaluation, using conditional value at risk (CVaR) as a risk measure and allocating the best weights for portfolio selection. Moreover, a robust portfolio optimizati...

Publication date: 2014